Back in March I came across the New York Times piece "How the Virus Got Out" on Facebook, and it left a deep impression on me. Since the outbreak took off in early March, confirmed cases and deaths across Europe and the US have climbed steadily, and news about COVID-19 has been everywhere. To keep track of the situation in each country, this project connects to a live COVID-19 data source and uses a leaflet map to show how severely each country is affected, together with the latest daily COVID-19 figures, so the reader can take it in at a glance.
To keep the epidemic from spreading, governments actively promoted prevention measures, and some companies moved their employees to working from home. Italy's outbreak was at the time second in severity only to China's, and the adult site Pornhub took the opportunity to announce that Italian users could use its normally paid Premium service for free throughout March. Although the offer was limited to Italian users, Pornhub's published global traffic report showed that the traffic increase after the announcement exceeded what Italy's user base alone could explain. This analysis is therefore split into a keyword comparison and a regional comparison: the keyword comparison uses Pornhub, FC2, avgle, and 18av to observe how often a given region searches for each of these four adult sites, while the regional comparison uses Pornhub as the main keyword and compares search counts across regions.
# Download the OWID COVID-19 workbook to a temporary file and read sheet 1.
covid19_xlsx = "https://covid.ourworldindata.org/data/owid-covid-data.xlsx"
tmp = tempfile(fileext=".xlsx")
httr::GET(url=covid19_xlsx, httr::write_disk(tmp))
covid19 = readxl::read_excel(tmp, sheet=1, col_names=TRUE, skip=0)
# Use today's data if it is available; otherwise fall back to the previous day.
# any(covid19$date == Sys.Date()) checks whether any row carries today's date;
# if none does (the condition below is TRUE), take the previous day's rows.
if (!any(covid19$date == Sys.Date())) {
  ori = subset(covid19, date == Sys.Date() - 1)
} else {
  ori = subset(covid19, date == Sys.Date())
}

df$popup = paste0("Date:", df$date, "<br>",
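The fallback above assumes the file lags by at most one day. A more defensive sketch (not the author's code; the data frame here is a toy stand-in with hypothetical values) simply keeps whatever the most recent date in the sheet happens to be:

```r
# Sketch: select the rows for the most recent date actually present,
# instead of assuming it is today or yesterday.
pick_latest = function(d) {
  subset(d, date == max(d$date))
}

# Toy stand-in for the OWID sheet (hypothetical values)
toy = data.frame(date = as.Date(c("2020-06-14", "2020-06-15", "2020-06-15")),
                 total_cases = c(10, 12, 30))
latest = pick_latest(toy)
```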
"Country:", df$ADMIN,"<br>",
"Total cases: ", df$total_cases,"<br>",
"Total deaths: ", df$total_deaths,"<br>",
"New cases: ", df$new_cases,"<br>",
"New deaths: ", df$new_deaths,"<br>",
"CVD19 death rate: ", df$cvd_death_rate,"<br>")
df = df[!is.na(df$cvd_death_rate),]
df_test =df[df$ADMIN == "Taiwan",]
df$cvd_death_rate = as.numeric(df$cvd_death_rate)
qpal = colorQuantile("Set3",df$cvd_death_rate, n=12)
map <- leaflet(df) %>%
addPolygons(popup=~popup,
stroke=FALSE,smoothFactor=0.2,fillOpacity=1,
color=~qpal(df$cvd_death_rate),
highlightOptions = highlightOptions(color = "white",weight = 2,bringToFront = TRUE),
labelOptions = labelOptions(style = list("font-weight" = "normal", padding = "3px 8px"),
textsize = "15px",direction = "auto"))%>%
addLegend(pal=qpal,values=df$cvd_death_rate,opacity = 0.7,title ="cvd19 death rate", position = "bottomright")
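colorQuantile() assigns colors by cutting the death rates at their empirical quantiles, so each color class holds roughly the same number of countries. The binning step can be sketched in base R (a simplified stand-in for what leaflet does internally, using 4 bins and hypothetical rates for brevity):

```r
# Sketch: quantile binning, conceptually what leaflet::colorQuantile does
# before mapping bins to palette colors.
quantile_bin = function(x, n = 4) {
  breaks = quantile(x, probs = seq(0, 1, length.out = n + 1), na.rm = TRUE)
  cut(x, breaks = unique(breaks), include.lowest = TRUE, labels = FALSE)
}

rates = c(0.1, 0.5, 1.2, 3.4, 5.6, 7.8, 9.0, 12.3)  # hypothetical death rates
bins = quantile_bin(rates, n = 4)
```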
map

# Windows' native encoding does not match the UTF-8 that Google expects, so
# text encodings must be converted first.
# Encoding() reports a string's current encoding format.
Encoding(keyword)
# google => r encoding (to check a string's encoding format)
# => https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/Encoding
# Two ways to mark keyword as UTF-8:
# 1. keyword = enc2utf8(keyword)   2. Encoding(keyword) = "UTF-8"
keyword = enc2utf8(keyword)
Encoding(keyword) = "UTF-8"
# Convert elements of a character vector to UTF-8.
# keyword = enc2utf8(keyword) and Encoding(keyword) = "UTF-8" are alternatives; use either one.
Encoding(keyword)
# "UTF-8" "UTF-8"
# If the Chinese characters in keyword are not UTF-8, convert them with enc2utf8(keyword).
# If Encoding(keyword) returns "unknown" "unknown", the later query will return NULL.
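A self-contained illustration of the conversion (the string is a hypothetical example, not one of the keywords used in the analysis):

```r
# Sketch: enc2utf8() returns a string explicitly marked as UTF-8
# whenever the input contains non-ASCII characters.
kw = enc2utf8("caf\u00e9")   # hypothetical non-ASCII keyword
```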
# Besides the string's own encoding, the system locale must also be changed
# before the query works: set locale="English" so that Chinese keywords
# (i.e., UTF-8 strings) can be queried. In effect, switch the OS locale that R
# uses to English so it cooperates with UTF-8 text.
# Sys.getlocale() reports the current system locale (encoding).
# On macOS it looks roughly like:
# "zh_TW.UTF-8/zh_TW.UTF-8/zh_TW.UTF-8/C/zh_TW.UTF-8/zh_TW.UTF-8" or
# "en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8"
# On Windows it looks roughly like:
Sys.getlocale()
# "LC_COLLATE=Chinese (Traditional)_Taiwan.950;LC_CTYPE=Chinese (Traditional)_Taiwan.950;LC_MONETARY=Chinese (Traditional)_Taiwan.950;LC_NUMERIC=C;LC_TIME=Chinese (Traditional)_Taiwan.950"
# locale = Chinese (Traditional)_Taiwan.950 (the Traditional-Chinese code page)
# cannot handle Chinese-keyword queries.
# => Windows' system locale (encoding) is Chinese (Traditional)_Taiwan.950, not "UTF-8".
Sys.setlocale(category="LC_ALL", locale="English")
# "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"

time = "2020-01-01 2020-06-04"
start_date = "2019-12-01"
end_date = Sys.Date()-1
time = paste(start_date, end_date)
# The time string must be exactly in this "start end" format.
trends = gtrendsR::gtrends(keyword,
geo=geo,
gprop=gprop,
time=time)
# Sometimes today's data cannot be fetched yet => Error: Cannot parse the supplied time format.
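When the request fails because the end date is too recent, one option is to step the end date back a day and retry. A sketch of that logic, where fetch_stub is a hypothetical stand-in for gtrendsR::gtrends() (no network calls are made):

```r
# fetch_stub is a hypothetical stand-in for gtrendsR::gtrends() that
# rejects any range ending on or after "today".
fetch_stub = function(time_range, today = as.Date("2020-06-16")) {
  end = as.Date(strsplit(time_range, " ")[[1]][2])
  if (end >= today) stop("Cannot parse the supplied time format.")
  time_range
}

# Retry with an earlier end date when the time range is rejected.
fetch_with_fallback = function(start, end, fetch, max_tries = 3) {
  for (i in seq_len(max_tries)) {
    res = tryCatch(fetch(paste(start, end)), error = function(e) NULL)
    if (!is.null(res)) return(res)
    end = end - 1          # step the end date back one day and retry
  }
  NULL
}

out = fetch_with_fallback(as.Date("2020-01-01"), as.Date("2020-06-16"), fetch_stub)
```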
# geo: where to search (e.g., "US" for the United States); gprop: which Google
# product to pull statistics from.
# names(trends) lists the components gtrends returns:
# 1. "interest_over_time" 2. "interest_by_region" 3. "interest_by_country"
# 4. "interest_by_dma" 5. "interest_by_city" 6. "related_topics" 7. "related_queries"
# onlyInterest = FALSE returns all of the above; TRUE returns only "interest_over_time".
names(trends)
onlyInterest = FALSE  # TRUE: return only "interest_over_time"; FALSE: return all seven components
trends = gtrendsR::gtrends(keyword,
geo=geo,
gprop=gprop,
time=time,
onlyInterest=onlyInterest)
names(trends)
# onlyInterest: TRUE/FALSE, whether to return only the time-series data (interest_over_time).
plot(trends)
# If the time window is stretched out, the result comes back as weekly or monthly data.

Sys.setlocale(category="LC_ALL", locale="Cht")
Sys.setlocale(category="LC_ALL", locale="C")
Sys.setlocale()
# "LC_COLLATE=Chinese (Traditional)_Taiwan.950;LC_CTYPE=Chinese (Traditional)_Taiwan.950;LC_MONETARY=Chinese (Traditional)_Taiwan.950;LC_NUMERIC=C;LC_TIME=Chinese (Traditional)_Taiwan.950"
# Either Sys.setlocale(category="LC_ALL", locale="Cht") or
# Sys.setlocale() will do; use one of them.
head(trends$interest_over_time)
head(trends$interest_by_country)
head(trends$interest_by_region)
head(trends$interest_by_dma)
head(trends$interest_by_city)
head(trends$related_topics)
head(trends$related_queries)
Encoding(trends$related_queries$value) = "UTF-8"
trends$related_queries$value = enc2utf8(trends$related_queries$value)
head(trends$related_queries)
related_queries = trends$related_queries

# Convert the values behind the chart to real numbers; hits below 1 appear as
# "<1", which in Google Trends means too little search volume to report.
library(highcharter)
interest_over_time = trends$interest_over_time
interest_over_time$date = as.Date(interest_over_time$date)
interest_over_time$hits = as.numeric(interest_over_time$hits)
# Some packages do not support character-typed dates, so convert them to Date
# or numeric types first.
# Change the types of interest_over_time$date and interest_over_time$hits.
hchart(interest_over_time,
type="line",
hcaes(x=date, y=hits))
# google => highcharter color group =>
# https://github.com/jbkunst/highcharter/issues/282
fig_highcharter = hchart(interest_over_time,
type="line",
hcaes(x=date, y=hits, group=keyword),
color=c("blue", "red","green","purple")) %>%
hc_rangeSelector(enabled=TRUE) %>%
hc_add_theme(hc_theme_ft())  # change the chart theme
fig_highcharter

# Google Trends (gtrendsR): the trend of search interest over time.
# The numbers represent how popular a search term is, over the chosen region
# and time window, relative to the highest point on the chart.
# A score of 100 means the term's popularity peaked at that point in time.
# A score of 50 means the term was half as popular as at its peak,
# and 0 means there was not enough data for the term.
# The Google Trends index is a relative measure, not the actual search volume.
# It is computed as follows:
# find the keyword's highest actual search volume within the window (say 500,000),
# divide each day's actual search volume by that 500,000,
# then multiply by 100 to obtain the index.
# This is the same procedure as the familiar "normalization",
# so the day with the highest search volume gets an index of 100.
# For this calculation the time frame matters enormously.
# For example, suppose you care about only 3 days, with actual search volumes
# of 50, 30, and 20. The index then reads: 100, 60 (30/50*100), 40 (20/50*100).
# Shorten the frame to just the last two days (30 and 20),
# and the index becomes 100 (30/30*100) and 67 (20/30*100).
# The two-day indices differ from the three-day ones because, within the 2-day
# frame, the maximum actual volume is now 30 rather than 50.
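The normalization described above is easy to verify directly (a sketch of the rescaling arithmetic, not Google's actual code):

```r
# Sketch: the Google Trends rescaling -- divide by the window maximum,
# multiply by 100, and round to whole index points.
trends_index = function(volume) {
  round(volume / max(volume) * 100)
}

three_days = trends_index(c(50, 30, 20))  # window max is 50
two_days   = trends_index(c(30, 20))      # window max shrinks to 30
```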
# Google Trends data starts in January 2004.
# The data frequency depends on the requested window, from one observation per
# minute at the finest up to monthly data:
# - "now 1-H": the past 1 hour, at 1-minute frequency;
# - "now 4-H": the past 4 hours, at 1-minute frequency;
# - "now 1-d": the past 1 day, at 8-minute frequency;
# - "now 7-d": the past 7 days, at hourly frequency;
# - "today 1-m": the past 30 days, at daily frequency;
# - "today 3-m": the past 90 days, at daily frequency;
# - "today 12-m": the past 12 months, at weekly (7-day) frequency;
# - "today+5-y": the past 5 years, at weekly frequency (the default);
# - "all": from January 2004 to now, at monthly frequency;
# - "Y-m-d Y-m-d": an explicit interval; the format must match exactly, with a
#   space between the two dates (e.g., "2019-04-01 2019-05-01"); the frequency
#   depends on the interval length.
# However, Google currently limits the time resolution based on the query's time frame.
# For example, query for the last 7 days will have hourly search trends
# (the so-called real time data),
# daily data is only provided for query period shorter than 9 months and up to 36 hours
# before your search (as explained by Google Trends FAQ),
# weekly data is provided for query between 9 month and 5 years,
# and any query longer than 5 years will only return monthly data.
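Before calling gtrends() with an explicit interval, the "Y-m-d Y-m-d" string can be sanity-checked with a small helper (a hedged sketch; gtrendsR itself provides no such helper):

```r
# Sketch: validate the "Y-m-d Y-m-d" time string that gtrends() expects --
# two zero-padded ISO dates separated by a single space.
is_valid_time = function(time) {
  grepl("^\\d{4}-\\d{2}-\\d{2} \\d{4}-\\d{2}-\\d{2}$", time)
}
```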
rm(list=ls(all=TRUE))
library(gtrendsR)
# gtrends(keyword = NA, geo = "", time = "today+5-y",
# gprop = c("web", "news", "images", "froogle", "youtube"),
# category = 0, hl = "en-US", low_search_volume = FALSE,
# cookie_url = "http://trends.google.com/Cookies/NID")
keyword = "pornhub"
Encoding(keyword)
# google => r encoding
# => https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/Encoding
keyword = enc2utf8(keyword)
Encoding(keyword) = "UTF-8"
# Convert elements of a character vector to UTF-8.
# keyword = enc2utf8(keyword) and Encoding(keyword) = "UTF-8" are alternatives; use either one.
Encoding(keyword)
# "UTF-8"
# If the Chinese characters in keyword are not UTF-8, convert them with enc2utf8(keyword).
# If Encoding(keyword) returns "unknown" "unknown", the later query will return NULL.
geo = c("IT", "ES","PT","US")
countries = gtrendsR::countries
geo_code = sort(unique(countries$country_code))
countries_IT = countries[countries$country_code == "IT",]
countries_IT = na.omit(countries_IT)
gprop = "web"
# gprop is the Google product category; choose from 5 products:
# "web", "news", "images", "froogle", "youtube"; "web" is the default.
# Sys.getlocale() reports the current system locale (encoding).
# On macOS it looks roughly like:
# "zh_TW.UTF-8/zh_TW.UTF-8/zh_TW.UTF-8/C/zh_TW.UTF-8/zh_TW.UTF-8" or
# "en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8"
# On Windows it looks roughly like:
Sys.getlocale()
# "LC_COLLATE=Chinese (Traditional)_Taiwan.950;LC_CTYPE=Chinese (Traditional)_Taiwan.950;LC_MONETARY=Chinese (Traditional)_Taiwan.950;LC_NUMERIC=C;LC_TIME=Chinese (Traditional)_Taiwan.950"
# locale = Chinese (Traditional)_Taiwan.950 cannot handle Chinese-keyword queries
# => Windows' system locale (encoding) is Chinese (Traditional)_Taiwan.950, not "UTF-8"
# => Windows needs its locale changed; macOS should not require it (untested).
Sys.setlocale(category="LC_ALL", locale="English")
# "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
# locale="English" must be set before Chinese (i.e., UTF-8) keywords can be queried.
time = "2020-01-01 2020-06-04"
start_date = "2020-01-01"
end_date = Sys.Date()-1
time = paste(start_date, end_date)
# The time string must be exactly in this "start end" format.
trends = gtrendsR::gtrends(keyword,
geo=geo,
gprop=gprop,
time=time)
# Sometimes today's data cannot be fetched yet => Error: Cannot parse the supplied time format.
names(trends)
# [1] "interest_over_time" "interest_by_country" "interest_by_region" "interest_by_dma"
# [5] "interest_by_city" "related_topics" "related_queries"
onlyInterest = FALSE
trends = gtrendsR::gtrends(keyword,
geo=geo,
gprop=gprop,
time=time,
onlyInterest=onlyInterest)
names(trends)
# onlyInterest: TRUE/FALSE, whether to return only the time-series data (interest_over_time).
plot(trends)
# If the time window is stretched out, the result comes back as weekly or monthly data.
Sys.setlocale(category="LC_ALL", locale="Cht")
Sys.setlocale(category="LC_ALL", locale="C")
Sys.setlocale()
# "LC_COLLATE=Chinese (Traditional)_Taiwan.950;LC_CTYPE=Chinese (Traditional)_Taiwan.950;LC_MONETARY=Chinese (Traditional)_Taiwan.950;LC_NUMERIC=C;LC_TIME=Chinese (Traditional)_Taiwan.950"
# Either Sys.setlocale(category="LC_ALL", locale="Cht") or
# Sys.setlocale() will do; use one of them.
# Switch R's default locale back to Traditional Chinese (locale="Cht") so that
# Chinese characters display correctly.
head(trends$interest_over_time)
head(trends$interest_by_country)
head(trends$interest_by_region)
head(trends$interest_by_dma)
head(trends$interest_by_city)
head(trends$related_topics)
head(trends$related_queries)
Encoding(trends$related_queries$value) = "UTF-8"
trends$related_queries$value = enc2utf8(trends$related_queries$value)
head(trends$related_queries)
related_queries = data.frame(trends$related_queries)
interest_over_time = trends$interest_over_time
# interest_over_time$hits[is.na(interest_over_time$hits)] = 0
interest_over_time$hits[interest_over_time$hits == "<1"] = 0
interest_over_time$hits = as.numeric(interest_over_time$hits)
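The "<1" replacement above can be checked in isolation (toy values, not real trends data):

```r
# Sketch: Google Trends reports very low interest as the string "<1";
# replace it with 0 so the column can be converted to numeric without NAs.
clean_hits = function(hits) {
  hits[hits == "<1"] = 0
  as.numeric(hits)
}

cleaned = clean_hits(c("54", "<1", "12"))
```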
# Data table
DT::datatable(interest_over_time)
# Plot with plotly
# https://www.rdocumentation.org/packages/plotly/versions/4.9.2.1/topics/plot_ly
# https://plotly.com/r/line-charts/
library(plotly)
fig_plotly = plotly::plot_ly(interest_over_time,
x=~date,
y=~hits,
color=~geo,
colors=c("blue", "red"),
type = 'scatter',
mode='lines')
fig_plotly
# Plot with highcharter
library(highcharter)
interest_over_time$date = as.Date(interest_over_time$date)
interest_over_time$hits = as.numeric(interest_over_time$hits)
# Change the types of interest_over_time$date and interest_over_time$hits.
hchart(interest_over_time,
type="line",
hcaes(x=date, y=hits))
# google => highcharter color group =>
# https://github.com/jbkunst/highcharter/issues/282
fig_highcharter = hchart(interest_over_time,
type="line",
hcaes(x=date, y=hits, group=geo),
color=c("blue", "red","green","purple")) %>%
hc_rangeSelector(enabled=TRUE) %>%
hc_add_theme(hc_theme_ft())
fig_highcharter
## 149 codes from your data successfully matched countries in the map
## 0 codes from your data failed to match with a country code in the map
## 94 codes from the map weren't represented in your data
## Response [https://covid.ourworldindata.org/data/owid-covid-data.xlsx]
## Date: 2020-06-16 23:31
## Status: 200
## Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
## Size: 3.18 MB
## <ON DISK> C:\Users\cherl\AppData\Local\Temp\Rtmpy2AoUK\file16042a9e2bd.xlsx
## [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
## [1] "LC_COLLATE=Chinese (Traditional)_Taiwan.950;LC_CTYPE=Chinese (Traditional)_Taiwan.950;LC_MONETARY=Chinese (Traditional)_Taiwan.950;LC_NUMERIC=C;LC_TIME=Chinese (Traditional)_Taiwan.950"
## [1] "C"
## date hits keyword geo time gprop category
## 1 2019-12-01 23 avgle TW 2019-12-01 2020-06-16 web 0
## 2 2019-12-02 20 avgle TW 2019-12-01 2020-06-16 web 0
## 3 2019-12-03 16 avgle TW 2019-12-01 2020-06-16 web 0
## 4 2019-12-04 14 avgle TW 2019-12-01 2020-06-16 web 0
## 5 2019-12-05 18 avgle TW 2019-12-01 2020-06-16 web 0
## 6 2019-12-06 20 avgle TW 2019-12-01 2020-06-16 web 0
## NULL
## location hits keyword geo gprop
## 1 Taoyuan City 100 avgle TW web
## 2 New Taipei City 94 avgle TW web
## 3 Kaohsiung City 94 avgle TW web
## 4 Taipei City 90 avgle TW web
## 5 Taichung City 88 avgle TW web
## 6 Tainan City 76 avgle TW web
## NULL
## location hits keyword geo gprop
## 1 Zhongshan District 100 avgle TW web
## 2 Zhongli District 86 avgle TW web
## 3 Banqiao District 76 avgle TW web
## 4 Xitun District 71 avgle TW web
## 5 Beitun District 70 avgle TW web
## 6 Sanchong District 70 avgle TW web
## NULL
## subject related_queries value geo keyword category
## 1 100 top av TW avgle 0
## 2 17 top press TW avgle 0
## 3 14 top jable TW avgle 0
## 4 12 top avgle \344\270\213\350\274\211 TW avgle 0
## 5 12 top ave TW avgle 0
## 6 7 top av01 TW avgle 0
## subject related_queries value geo keyword category
## 1 100 top av TW avgle 0
## 2 17 top press TW avgle 0
## 3 14 top jable TW avgle 0
## 4 12 top avgle <U+4E0B><U+8F09> TW avgle 0
## 5 12 top ave TW avgle 0
## 6 7 top av01 TW avgle 0
## [1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
## [1] "LC_COLLATE=Chinese (Traditional)_Taiwan.950;LC_CTYPE=Chinese (Traditional)_Taiwan.950;LC_MONETARY=Chinese (Traditional)_Taiwan.950;LC_NUMERIC=C;LC_TIME=Chinese (Traditional)_Taiwan.950"
## [1] "C"
## date hits keyword geo time gprop category
## 1 2020-01-01 54 pornhub IT 2020-01-01 2020-06-16 web 0
## 2 2020-01-02 54 pornhub IT 2020-01-01 2020-06-16 web 0
## 3 2020-01-03 53 pornhub IT 2020-01-01 2020-06-16 web 0
## 4 2020-01-04 53 pornhub IT 2020-01-01 2020-06-16 web 0
## 5 2020-01-05 51 pornhub IT 2020-01-01 2020-06-16 web 0
## 6 2020-01-06 53 pornhub IT 2020-01-01 2020-06-16 web 0
## NULL
## location hits keyword geo gprop
## 1 Lombardy 100 pornhub IT web
## 2 Tuscany 99 pornhub IT web
## 3 Emilia-Romagna 98 pornhub IT web
## 4 Marche 98 pornhub IT web
## 5 Liguria 98 pornhub IT web
## 6 Piedmont 94 pornhub IT web
## location hits keyword geo gprop
## 1 Corpus Christi TX 100 pornhub US web
## 2 Greenwood-Greenville MS 99 pornhub US web
## 3 Wheeling WV-Steubenville OH 98 pornhub US web
## 4 Great Falls MT 95 pornhub US web
## 5 Odessa-Midland TX 95 pornhub US web
## 6 Minot-Bismarck-Dickinson(Williston) ND 93 pornhub US web
## location hits keyword geo gprop
## 1 Guidonia NA pornhub IT web
## 2 Giugliano in Campania NA pornhub IT web
## 3 Alessandria NA pornhub IT web
## 4 Pesaro NA pornhub IT web
## 5 Livorno NA pornhub IT web
## 6 Pistoia NA pornhub IT web
## NULL
## subject related_queries value geo keyword category
## 1 100 top porno pornhub IT pornhub 0
## 2 53 top gay pornhub IT pornhub 0
## 3 48 top video pornhub IT pornhub 0
## 4 33 top pornhub premium IT pornhub 0
## 5 24 top pornhub italia IT pornhub 0
## 6 24 top pornhub italiano IT pornhub 0
## subject related_queries value geo keyword category
## 1 100 top porno pornhub IT pornhub 0
## 2 53 top gay pornhub IT pornhub 0
## 3 48 top video pornhub IT pornhub 0
## 4 33 top pornhub premium IT pornhub 0
## 5 24 top pornhub italia IT pornhub 0
## 6 24 top pornhub italiano IT pornhub 0
Because of time-zone differences, each country updates its data at a different time, so the output shows only what is available for the current day. The COVID-19 map built with Shiny supports only a single selection.
Using the four adult sites "avgle", "FC2", "pornhub", and "18av" as targets and observing search counts from Taiwan gives a rough idea of which site and style Taiwan leans toward.
With Pornhub as the keyword, the regions set to Italy, Portugal, Spain, and the United States, and the start date set to 2019-12-01, it is clear that in March the search counts from Italy, Portugal, and Spain were far higher than from the US. Pornhub's promotion evidently worked, consistent with the traffic report it published around that time.
The Shiny version accepts only one keyword, one region, and one time window. Interestingly, with Pornhub as the keyword, many regions show a notably high search count in March.
### Most of the notes are in the code comments
Applied-statistics report: mask01 through mask03 together form one topic. 01 downloads a csv and reads the data in. 02 connects to the website and reads the data directly, with no csv file, marks latitude/longitude on a map, and draws the map (leaflet package). 03 is visualization (heat maps and the like are HTML and need not stay inside R, but auto-updating the page content requires the shiny package).
02 reads the data straight from a URL, so re-running the script refreshes the data. Extra credit: use different icons for different kinds of information, e.g., a food map where American, Japanese, and Korean restaurants get different icons. Heat map: places where events occur frequently get darker colors, and vice versa. What leaflet may not manage: drawing per-county severity from confirmed COVID-19 counts requires township/county boundary data, and the packages may not provide Taiwan's township boundaries at that granularity, so that level of detail may be out of reach. 02mask2 reads a .json file (other packages must be installed); 02mask1 reads a .csv file; government open data is often .json. The more complex the data handling and the more code it takes, the higher the score.
Shiny inside R Markdown has no layout control: the inputs always sit at the top and the results below, but the code is simpler than a standalone shiny app. inputPanel sets up the inputs: sliderInput gives a slider; selectInput gives a dropdown, where choices is the menu contents and selected is the default. The user's choice arrives in the code as a character variable, but the choices must still be written as a vector. renderPlot draws the result: the original R code has to be wired to inputPanel so the selected parameters flow back into the R code. Math: kernel density estimation, a nonparametric estimator.
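The inputPanel / selectInput / renderPlot pattern described above can be sketched as a chunk inside an R Markdown document with `runtime: shiny` (a minimal illustration; the keyword choices and the density plot are hypothetical placeholders, not the report's actual app):

```r
library(shiny)

inputPanel(
  selectInput("keyword", "Keyword:",
              choices = c("pornhub", "FC2", "avgle", "18av"),  # menu contents
              selected = "pornhub")                            # default value
)

renderPlot({
  # input$keyword carries the user's choice back into ordinary R code
  hits = rnorm(100)                       # placeholder for real trends data
  plot(density(hits), main = input$keyword)
})
```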
05-wordcloud2_shiny: wordcloud2 is not base-R graphics but a separately installed package, so look up separately how to use it under shiny, e.g., search "leaflet in r shiny".
Write the R code first, design the inputs, check that the shiny version of the code is correct, and adjust/modify parameters in the form name = input$name.
COVID-19 live data: https://ourworldindata.org/coronavirus-data Country boundaries: Applied Statistics Moodle -> leaflet_rworldmap